Map Reduce: A Survey Paper on Recent Expansion

Authors

  • Shafali Agarwal
  • Zeba Khanam
Abstract

With the rapid growth of data in recent years, industry and academia require intelligent data analysis tools that can analyze huge amounts of data. The MapReduce framework is designed to run data-intensive applications in support of effective decision making. Since its introduction, considerable research effort has gone into making it more accessible to users and into supporting the execution of massive data-intensive applications. This survey examines the state of the art in improving the performance of various applications using recent MapReduce models, and how these models are useful for processing large-scale datasets. The models are compared with Apache Hadoop and Phoenix, primarily on the basis of execution time and fault tolerance. Finally, the paper gives a high-level discussion of enhancements to MapReduce computation in specific problem areas such as iterative computation, continuous query processing, and hybrid databases.

Keywords: Map Reduce; Hadoop; Iterative Computation; Phoenix; Databases

Similar resources

Developing a model for simulating urban expansion based on the concept of decision risk: A case study in Babol city

Today, the study of the spatial-temporal pattern of urban physical expansion and the identification of the parameters affecting that expansion play a crucial role in urban decision-making and long-term planning processes. Consequently, the use of precise and efficient methods to predict the physical expansion of urban areas is of great importance. The objective of the present study is to pro...

A Survey on Accelerated Mapreduce for Hadoop

Big Data is defined by the 3Vs: variety, volume, and velocity. The volume of data is huge, the data exists in a variety of file types, and it grows very rapidly. Storing and processing big data has always been a major issue, and handling it has become even more challenging in recent years. High-performance techniques have been introduced to handle big data. Several frameworks like Apache...

Distributed Parameter Map-Reduce

This paper describes how to convert a machine learning problem into a series of map-reduce tasks. We study the logistic regression algorithm, in which samples are assumed to be independent and each sample is assigned a probability. Parameters are obtained by maximizing the product of all sample probabilities. Rapid expansion of training samples brings challenges to mach...

Survey on Load Balancing and Data Skew Mitigation in Mapreduce Applications

For several years, the Map Reduce programming model has shown great success in processing huge amounts of data. Map Reduce is a framework for data-intensive distributed computing of batch jobs. This data-intensive processing creates skew in the Map Reduce framework and significantly degrades performance. This leads to greatly varying execution times for Map Reduce jobs. Due to this varying execution t...

Map-Reduce Expansion of the ISGA Genomic Analysis Web Server

Biological sequence data can be subjected to a variety of analysis workflows to glean pertinent scientific insight. Recent advances in sequencing techniques have led to a deluge of biosequence data, which necessitates the use of high-performance computing resources in order to carry out analysis in a reasonable period of time. The tasks involved in creating and managing these computational jobs...

Publication date: 2015